Achieving k-Anonymity for Associative Classification in Incremental-data Scenarios
Authors
Abstract
When a data mining model is developed, one of the most important issues is preserving the privacy of the input data. In this paper, we address the problem of transforming data to preserve privacy with regard to a particular data mining technique, associative classification, in an incremental-data scenario. We propose an incremental polynomial-time algorithm that transforms the data to meet a privacy standard, namely k-Anonymity, while still preserving the data quality needed to build the associative classification model. The computational complexity of the proposed incremental algorithm ranges from O(n log n) to O(Δn), depending on the incremental data. Experiments were conducted to compare the proposed algorithm with a non-incremental algorithm. The results show that the proposed incremental algorithm is more efficient in every problem setting.
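To make the privacy standard concrete, here is a minimal sketch of a k-anonymity check: a table is k-anonymous when every combination of quasi-identifier values is shared by at least k rows. This is not the paper's incremental algorithm. The function names, the sample rows, and the age-banding generalization are hypothetical and serve only as illustration.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs in at least k rows."""
    groups = Counter(tuple(row[attr] for attr in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


def generalize_age(row, width=10):
    """Hypothetical generalization step: replace an exact age with a range of the given width."""
    low = (row["age"] // width) * width
    return {**row, "age": f"{low}-{low + width - 1}"}


# Illustrative data only; "label" stands in for the class attribute of a classifier.
rows = [
    {"age": 23, "zip": "10110", "label": "yes"},
    {"age": 27, "zip": "10110", "label": "no"},
    {"age": 31, "zip": "10110", "label": "yes"},
    {"age": 38, "zip": "10110", "label": "yes"},
]

# Every raw row is unique on (age, zip), so the raw table is not 2-anonymous.
raw_ok = is_k_anonymous(rows, ["age", "zip"], 2)

# After banding ages into decades, each (age, zip) group contains two rows.
generalized = [generalize_age(r) for r in rows]
gen_ok = is_k_anonymous(generalized, ["age", "zip"], 2)
```

The check recomputes all groups from scratch. An incremental approach such as the one the abstract describes would presumably avoid reprocessing unchanged data when new rows arrive, but the abstract does not give those details.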
Related resources
Privacy-preserving data mining: A feature set partitioning approach
In privacy-preserving data mining (PPDM), a widely used method for achieving data mining goals while preserving privacy is based on k-anonymity. This method, which protects subject-specific sensitive data by anonymizing it before it is released for data mining, demands that every tuple in the released table should be indistinguishable from no fewer than k subjects. The most common approach for ...
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in the prevention, diagnosis and treatment of diseases at the genomic level. In this paper, fast Global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
A Survey of Privacy Preserving Data Publishing using Generalization and Suppression
Information sharing has become indispensable, prompting extensive discussion of methods and techniques for privacy-preserving data publishing, which are regarded as a strong guarantee against information disclosure and for protecting individuals’ privacy. Recent work focuses on proposing different anonymity algorithms for varying data publishing scenarios to satis...
Query Processing with K-Anonymity
Anonymization techniques are used to preserve the privacy of data owners, especially for personal and sensitive data. Although in most cases data reside inside the database management system (DBMS), most of the proposed anonymization techniques operate on and anonymize isolated datasets stored outside the DBMS. Hence, most of the desired functionalities of the DBMS are lost, e.g., consist...
Protecting Privacy by Multi-dimensional K-anonymity
Privacy protection for incremental data has a great effect on data availability and practicality. K-anonymity is an important approach to protecting data privacy in data publishing scenarios. However, finding an optimal k-anonymity on a dataset with multiple attributes is an NP-hard problem. Most partitions in k-anonymity at present are single-dimensional. Now research on k-anonymity mainly focuses on gett...